The purpose of this analysis is to find synergy biomarkers using the emba R package in the cascade boolean model datasets (Cascade is the name of the topology used).
Particularly, we will investigate \(4\) synergies: AK-PD, PD-PI, PD-G2, PI-D1 and try to find causal mechanisms that might explain why and where (pathways) these synergies manifest.
Note that AK is an AKT inhibitor and PI is a PI3K inhibitor and they both target the PI3K/AKT/mTOR pathway. The combination of such an inhibitor with a MEK inhibitor - PD - that targets the MAPK/ERK pathway, has proven to be more effective than the single drug treatment during clinical trials with patients that had advanced colorectal carcinoma (source).
The G2 drug is a PDPK1 inhibitor and the D1 drug is an inhibitor of RSK isoforms (RPS6KA1, RPS6KA3, RPS6KA2, RPS6KA6).
The boolean model datasets are in total \(9\): one for each cell line
of interest (8 cell lines) where the models were fitted to a specific steady state in each
case and one for the so-called random models which were generated randomly in
the sense that were fitted only to a proliferation state (simulations were done using
the DrugLogics software modules Gitsbe and Drabme).
Loading libraries:
library(emba)
library(usefun)
library(ComplexHeatmap)
library(circlize)
library(dplyr)
library(tibble)
library(DT)
library(ggpubr)
library(Ckmeans.1d.dp)
We load the cell-specific input data:
# Cell Lines
cell.lines = c("A498", "AGS", "DU145", "colo205", "SW620", "SF295", "UACC62", "MDA-MB-468")
cell.line.dirs = sapply(cell.lines, function(cell.line) {
paste0(getwd(), "/", cell.line)
})
# Model predictions
model.predictions.files = sapply(cell.line.dirs, function(cell.line.dir) {
paste0(cell.line.dir, "/model_predictions")
})
model.predictions.per.cell.line = lapply(model.predictions.files, function(file) {
get_model_predictions(file)
})
# Observed synergies
observed.synergies.files = sapply(cell.line.dirs, function(cell.line.dir) {
paste0(cell.line.dir, "/observed_synergies")
})
observed.synergies.per.cell.line = lapply(observed.synergies.files, function(file) {
get_observed_synergies(file)
})
# Models Stable State (1 per model)
models.stable.state.files = sapply(cell.line.dirs, function(cell.line.dir) {
paste0(cell.line.dir, "/models_stable_state")
})
models.stable.state.per.cell.line = lapply(models.stable.state.files, function(file) {
as.matrix(read.table(file, check.names = FALSE))
})
# Models Link Operators
models.link.operator.files = sapply(cell.line.dirs, function(cell.line.dir) {
paste0(cell.line.dir, "/models_link_operator")
})
models.link.operators.per.cell.line = lapply(models.link.operator.files, function(file) {
as.matrix(read.table(file, check.names = FALSE))
})
The random model input data:
random.dir = paste0(getwd(), "/random")
random.model.predictions = get_model_predictions(paste0(random.dir, "/model_predictions"))
random.models.stable.state = as.matrix(
read.table(file = paste0(random.dir, "/models_stable_state"), check.names = FALSE)
)
random.models.link.operator =
as.matrix(read.table(file = paste0(random.dir, "/models_link_operator"), check.names = FALSE))
# the node names used in our analysis
node.names = colnames(random.models.stable.state)
# the tested drug comtions
drug.combos = colnames(random.model.predictions)
Using the generic function biomarker_synergy_analysis from the emba R package, we can find
synergy biomarkers i.e. nodes whose activity and boolean equation parameterization (link operator) affect the manifestation of synergies in the boolean models. Models are classified based on whether they predict or not each of the predicted synergies found in each boolean dataset.
First we run the analysis on the cell-specific boolean model datasets (note
that every input to the biomarker_synergy_analysis function changes per cell line):
cell.specific.synergy.analysis.res = list()
for (cell.line in cell.lines) {
cell.specific.synergy.analysis.res[[cell.line]] =
biomarker_synergy_analysis(model.predictions.per.cell.line[[cell.line]],
models.stable.state.per.cell.line[[cell.line]],
models.link.operators.per.cell.line[[cell.line]],
observed.synergies.per.cell.line[[cell.line]], threshold = 0.7)
}
Next we run the analysis on the random boolean model datasets (note
that the only input to the biomarker_synergy_analysis function that changes
is the observed synergies per cell line - the rest is the same data from the
random model dataset):
# Synergy Biomarkers for cell proliferation models
random.synergy.analysis.res = list()
for (cell.line in cell.lines) {
random.synergy.analysis.res[[cell.line]] =
biomarker_synergy_analysis(random.model.predictions,
random.models.stable.state, random.models.link.operator,
observed.synergies.per.cell.line[[cell.line]], threshold = 0.7)
}
The results from all the previous code chunks have already been executed and the result data is saved in the cascade_synergy_biomarkers.RData file. We load all the objects:
#save.image(file = "cascade_synergy_biomarkers.RData")
load(file = "cascade_synergy_biomarkers.RData")
Each of the cell lines studied has a different set of observed synergies (drug combinations that were found synergistic across all the 153 tested ones). In this section, we will visualize the cell lines’ observed synergies and mark the synergies that were also predicted by the cell-specific models and the random-generated ones. First, we get the biomarkers for these synergies from each cell line:
total.predicted.synergies.cell.specific =
unique(unlist(sapply(cell.specific.synergy.analysis.res, function(x) { x$predicted.synergies})))
total.predicted.synergies.cell.specific.num = length(total.predicted.synergies.cell.specific)
The same for the random models:
total.predicted.synergies.random =
unique(unlist(sapply(random.synergy.analysis.res, function(x) { x$predicted.synergies})))
total.predicted.synergies.random.num = length(total.predicted.synergies.random)
Then, we get the observed synergies from each cell line in a data.frame:
observed.synergies.res = get_observed_synergies_per_cell_line(cell.line.dirs, drug.combos)
# remove drug comtions which are not observed in any of the cell lines
observed.synergies.res = prune_columns_from_df(observed.synergies.res, value = 0)
total.observed.synergies = colnames(observed.synergies.res)
total.observed.synergies.num = length(total.observed.synergies)
Lastly, we visualize the observed and predicted synergies for all cell lines in one heatmap:
# color the cell-specific predicted synergies
predicted.synergies.colors = rep("black", total.observed.synergies.num)
names(predicted.synergies.colors) = total.observed.synergies
common.predicted.synergies = intersect(total.predicted.synergies.cell.specific,
total.predicted.synergies.random)
cell.specific.only.predicted.synergies =
total.predicted.synergies.cell.specific[!total.predicted.synergies.cell.specific %in% total.predicted.synergies.random]
random.only.predicted.synergies =
total.predicted.synergies.random[!total.predicted.synergies.random %in% total.predicted.synergies.cell.specific]
predicted.synergies.colors[total.observed.synergies %in%
common.predicted.synergies] = "blue"
predicted.synergies.colors[total.observed.synergies %in%
cell.specific.only.predicted.synergies] = "orange"
predicted.synergies.colors[total.observed.synergies %in%
random.only.predicted.synergies] = "purple"
# define a coloring
obs.synergies.col.fun = colorRamp2(c(0, 1), c("red", "green"))
observed.synergies.heatmap =
Heatmap(matrix = as.matrix(observed.synergies.res),
col = obs.synergies.col.fun,
column_title = "Observed synergies per cell line",
column_title_gp = gpar(fontsize = 20),
column_names_gp = gpar(col = predicted.synergies.colors),
row_title = "Cell Lines", row_title_side = "left",
row_dend_side = "right", row_names_side = "left",
rect_gp = gpar(col = "black", lwd = 0.3),
heatmap_legend_param = list(at = c(1, 0), labels = c("YES", "NO"),
color_bar = "discrete", title = "Observed", direction = "vertical"))
lgd = Legend(at = c("Cell-specific", "Random", "Both"), title = "Predicted",
legend_gp = gpar(fill = c("orange", "purple", "blue")))
draw(observed.synergies.heatmap, heatmap_legend_list = list(lgd),
heatmap_legend_side = "right")
AK-BI,
PI-D1)AK-G4 and 5Z-D1 are observed synergies that only the cell-specific models could predict, whereas the G2-P5 and PI-D4 are observed synergies that only the random models could predict. This shows us that a complimentary
approach is needed when searching for biomarkers as the two different kind of
models (trained to a specific activity state profile vs trained to proliferation)
although they share common true positives regarding the synergies they predict,
there are also synergies only a specific class of models could predict.AK-PD, PD-PI and PD-G2 were observed in
the A498 cell line and predicted by both the cell-specific and random models.As observed above, the AK-PD synergy was predicted by both the cell specific and random models in the A498 cell line.
We get the average state and link operator differences per network node for the A498 cell line from the cell-specific models:
AK.PD.avg.state.diff.cell.specific = cell.specific.synergy.analysis.res$A498$diff.state.synergies.mat["AK-PD", ]
AK.PD.avg.link.diff.cell.specific = cell.specific.synergy.analysis.res$A498$diff.link.synergies.mat["AK-PD", ]
We build the network from the topology file:
topology.file = paste0(getwd(), "/topology")
net = construct_network(topology.file = topology.file, models.dir = paste0(getwd(), "/AGS/models"))
# a static layout for plotting the same network always
coordinates.file = paste0(getwd(), "/network_xy_coordinates")
nice.layout = as.matrix(read.table(coordinates.file))
We will now visualize the nodes average state differences in a network graph. Note that the good models are those that predicted the AK-PD drug comtion to be synergistic and were contrasted to those that predicted it to be antagonistic (bad models). The number of models in each category were:
model.predictions = model.predictions.per.cell.line[["A498"]]
models.stable.state = models.stable.state.per.cell.line[["A498"]]
drug.comb = "AK-PD"
good.models.num = sum(model.predictions[, drug.comb] == 1 & !is.na(model.predictions[, drug.comb]))
# unique good models
# models.link.operator = models.link.operators.per.cell.line$A498
# nrow(unique(models.link.operator[model.predictions[, drug.comb] == 1 & !is.na(model.predictions[, drug.comb]), ]))
bad.models.num = sum(model.predictions[, drug.comb] == 0 & !is.na(model.predictions[, drug.comb]))
pretty_print_string(paste0("Number of 'good' models (AK-PD synergistic) in the A498 cell line: ", good.models.num))
Number of ‘good’ models (AK-PD synergistic) in the A498 cell line: 26
pretty_print_string(paste0("Number of 'bad' models (AK-PD antagonistic) in the A498 cell line: ", bad.models.num))
Number of ‘bad’ models (AK-PD antagonistic) in the A498 cell line: 2672
plot_avg_state_diff_graph(net, diff = AK.PD.avg.state.diff.cell.specific,
layout = nice.layout, title = "AK-PD activity state biomarkers (Cell specific models - A498)")
Thus, we can identify the active state biomarkers:
AK.PD.active.biomarkers = AK.PD.avg.state.diff.cell.specific[AK.PD.avg.state.diff.cell.specific > 0.7]
pretty_print_vector_names(AK.PD.active.biomarkers)
1 node: ERK_f
So, the AK-PD synergy manifests in cancer cell models that have the ERK_f family logical node in an active state.
The MAPK-ERK signaling pathway has been studied a lot and has been found to be overexpressed/have increased activity in cancers and as such cancer treatments that include the inhibition of that pathway are found to be most beneficial.
Paper evidence for ERK overexpression in cancer (there are many):
The inhibited state biomarkers are:
AK.PD.inhibited.biomarkers = AK.PD.avg.state.diff.cell.specific[AK.PD.avg.state.diff.cell.specific < -0.7]
pretty_print_vector_names(AK.PD.inhibited.biomarkers)
2 nodes: PTPN11, GAB_f
We will demonstrate that the PTPN11 and GAB_f inhibited biomarkers are a direct consequence of the overexpression of ERK_f:
If we check the logical equations related to the above biomarkers we see that:
pretty_print_string("ERK_f *= ( MEK_f ) AND/OR NOT ( ( DUSP6 ) or PPP1CA )")
ERK_f *= ( MEK_f ) AND/OR NOT ( ( DUSP6 ) or PPP1CA )
pretty_print_string("GAB_f *= ( GRB2 ) AND/OR NOT ( ERK_f )")
GAB_f *= ( GRB2 ) AND/OR NOT ( ERK_f )
pretty_print_string("PTPN11 *= ( GAB_f )")
PTPN11 *= ( GAB_f )
So, pretty much if the GAB_f node is more inhibited in the models that predicted the AK-PD synergy (good models), then PTPN11 is also as well.
Also the average activity difference of the GRB2 node is -0.3434189, which makes the GAB_f node more inhibited in the good models since it’s activity is mostly dependent on the ERK_f node, which is mostly overexpressed in the good models.
All in all, the overexpression of ERK_f is what causes the two other inhibited biomarkers.
We visualize the nodes average link operator differences in a network graph:
plot_avg_link_operator_diff_graph(net, diff = AK.PD.avg.link.diff.cell.specific,
layout = nice.layout, title = "AK-PD link operator biomarkers (Cell specific models - A498)")
So, the models that predicted the AK-PD synergy, had the OR NOT as a link operator in the boolean equation that has the ERK_f node as target, i.e. ERK_f *= ( MEK_f ) OR NOT ( ( DUSP6 ) or PPP1CA ) instead of ERK_f *= ( MEK_f ) AND NOT ( ( DUSP6 ) or PPP1CA ).
The difference in the result of the logical equation can be seen in the next two truth tables where the use of the OR NOT link operator makes the end truth value more flexible in the sense that a lot more more TRUE values are possible, since the activating regulator MEK_f (which is the PD drug’s target) has more weight in the outcome:
If you think it in reverse terms, in cancer models in which the ERK_f equation is mostly based on an AND NOT link operator logic and have the ERK_F node as inactive (also from the equation DUSP6 *= ( ( mTORC1_c ) or ERK_f ), DUSP6 will also be inactive), the MEK_f node will also be mostly inactive (\(2\) out of \(3\) cases in the table above: 1st, 2nd and 4th row) - it can be seen as a node that does not play an important role in the activity outcome of ERK_f - and so by inhibiting it further with the PD drug will not bring any significant benefit for cancer treatment.
Thus, in cancer boolean models where the ERK_f node is overexpressed and the MEK_f logical node is it’s most crucial regulator, inhibiting both the MAPK/ERK pathway (drug PD) and the AKT pathway (drug AK) is a synergistic combination for cancer treatment.
It will be interesting to find all the possible synergy sets and subsets that include the AK-PD as the extra synergy.
Using these synergy sets, models that predict a set of synergies will be contrasted to models that predicted the same set with the addition of the extra AK-PD synergy. Thus we could find synergy biomarkers that allow already good predicting models to predict the additional synergy of interest.
This investigation will allow us thus to refine the activity state and link operator biomarkers we found above.
We first construct two matrices: in the first, each row is a set vs subset average activity difference vector of the network nodes, while on the second each row is a set vs subset average link operator difference vector of the network nodes:
model.predictions = model.predictions.per.cell.line$A498
models.stable.state = models.stable.state.per.cell.line$A498
models.link.operator = models.link.operators.per.cell.line$A498
res = get_synergy_comparison_sets(cell.specific.synergy.analysis.res$A498$synergy.subset.stats)
AK.PD.res = res %>% filter(synergies == "AK-PD")
diff.state.list = list()
diff.link.list = list()
for (i in 1:nrow(AK.PD.res)) {
synergy.set = AK.PD.res[i, 2]
synergy.subset = AK.PD.res[i, 3]
synergy.set.str = unlist(strsplit(x = synergy.set, split = ","))
synergy.subset.str = unlist(strsplit(x = synergy.subset, split = ","))
# count models
synergy.set.models.num = count_models_that_predict_synergies(synergy.set.str, model.predictions)
synergy.subset.models.num = count_models_that_predict_synergies(synergy.subset.str, model.predictions)
# if too small number of models, skip the diff vector
if ((synergy.set.models.num <= 3) | (synergy.set.models.num <= 3))
next
# get the diff
diff.state.ak.pd = get_avg_activity_diff_based_on_synergy_set_cmp(synergy.set.str, synergy.subset.str, model.predictions, models.stable.state)
diff.link.ak.pd = get_avg_link_operator_diff_based_on_synergy_set_cmp(synergy.set.str, synergy.subset.str, model.predictions, models.link.operator)
diff.state.list[[paste0(synergy.set, " vs ", synergy.subset)]] = diff.state.ak.pd
diff.link.list[[paste0(synergy.set, " vs ", synergy.subset)]] = diff.link.ak.pd
}
diff.state.mat = do.call(rbind, diff.state.list)
diff.link.mat = do.call(rbind, diff.link.list)
caption.title.1 = "Table 1: Average activity difference values across all synergy set comparisons (AK-PD)"
datatable(data = diff.state.mat[, c("SRC", "CSK", "MEK_f", "STAT1", "PTPN6")], options = list(
searching = FALSE, pageLength = 5, lengthMenu = c(5, 10)),
caption = htmltools::tags$caption(caption.title.1, style="color:#dd4814; font-size: 18px")) %>%
formatRound(1:ncol(diff.state.mat), digits = 3)
caption.title.2 = "Table 2: Average link operator difference values across all synergy set comparisons (AK-PD)"
datatable(data = diff.link.mat[, c("SRC", "RAC_f", "MEK_f", "STAT1", "PTEN")], options = list(
searching = FALSE, pageLength = 5, lengthMenu = c(5, 10)),
caption = htmltools::tags$caption(caption.title.2, style="color:#dd4814; font-size: 18px")) %>%
formatRound(1:ncol(diff.link.mat), digits = 3)
Using the matrix from Table 1 (where we show just \(5\) nodes), we count per network node the number of times that the node’s average activity difference value has surpassed a specified threshold - i.e. the number of times it has been found as important (a biomarker) across all the synergy set comparisons (so the more the better):
threshold = 0.8
biomarker.state.mat = binarize_to_thres(mat = diff.state.mat, threshold)
biomarker.state.counts = colSums(biomarker.state.mat)
pretty_print_vector_names_and_values(table(biomarker.state.counts))
0: 141, 1: 1, 11: 2
The above means that there were \(141\) nodes that were zero times found as activity state biomarkers across all synergy set comparisons, one that was found once and \(2\) that were found \(11\) times (out of a total of 19). These nodes are:
pretty_print_vector_names(biomarker.state.counts[biomarker.state.counts == 11])
2 nodes: SRC, PTPN6
If we visualize the average activity state difference across all synergy comparison sets from Table 1 we can see that the previous \(2\) nodes are more pronounced:
plot_avg_state_diff_graph(net, diff = colMeans(diff.state.mat), layout = nice.layout,
title = "Average activity state diff across all synergy subsets (AK-PD)")
We also visualize the median activity difference per node from Table 1 using a dotchart, where we have excluded the nodes that have an absolute median activity difference of \(0.05\) or less:
df = as.data.frame(apply(diff.state.mat, 2, median))
colnames(df) = "median.state.diff"
df = df %>%
rownames_to_column(var = "nodes") %>%
filter(median.state.diff > 0.05 | median.state.diff < -0.05)
ggdotchart(df, x = "nodes", y = "median.state.diff",
title = "Median activity state difference across all synergy comparison sets (AK-PD, A498 cell line)",
label = "nodes", font.label = list(size = 11, color = "blue"),
label.select = list(criteria = "`y` >= 0.7 | `y` <= -0.7"),
repel = TRUE, label.rectangle = TRUE,
xlab = "Nodes", ylab = "Median Activity State",
ylim = c(-1,1), add = "segments") +
font("x.text", size = 10) +
font("title", size = 11) +
geom_hline(yintercept = -0.7, linetype = "dashed", color = "red") +
geom_hline(aes(yintercept = 0.7, linetype = "0.7"), color = "red") +
scale_linetype_manual(name = "Threshold", values = 2,
guide = guide_legend(override.aes = list(color = "red"))) +
theme(legend.position = c(0.2,0.7))
Note the boolean equation: PTPN6 *= SRC. This means that SRC is the only inhibited node of interest
We will now check if the above observations regarding the activity of the nodes are true using the MCLP dataset as a reference:
mclp.data = read.table(file = "MCLP-v1.1-Level4.tsv", header = TRUE, stringsAsFactors = FALSE)
cell.lines.in.mclp = c("A498", "AGS", "DU145", "COLO205", "SW620", "SF295", "UACC62", "MDAMB468")
We check the phosphorylation value of the SRC_pY527 across all cell lines:
src.data = mclp.data %>% select(Sample_Name, SRC_pY527) %>% na.omit()
# find the activity classes (3 classes: low activity, no activity, high activity)
res = Ckmeans.1d.dp(x = src.data$SRC, k = 3)
activity = as.factor(res$cluster)
levels(activity) = c("low", "medium", "high")
src.data = cbind(src.data, activity)
ggdotchart(data = src.data, x = "Sample_Name", y = "SRC_pY527",
title = "SRC_pY527 signaling across all cell lines in MCLP Dataset",
color = "activity", palette = c("red", "grey", "green"),
label = "Sample_Name", label.select = cell.lines.in.mclp, repel = TRUE,
add = "segments", label.rectangle = TRUE,
xlab = "Cell Lines", ylab = "SRC_pY527 Signaling",
add.params = list(color = "activity", palette = c("red", "grey", "green"))) +
theme(axis.text.x = element_blank(), axis.ticks = element_blank())
It has been shown in various studies (e.g. in this paper) that the phosphorylation sites of SRC include the inhibiting phosphotyrosine 527 site, which overrides the Y416 phosphorylation (which is also tested in the MCLP dataset).
Since for A498 cell line we observe one of the largest signaling measurements compared to all the cell lines for the SRC_pY527 site, this means that SRC should be in an inhibiting state, which is exactly what we found from our analysis above.
We also check for the phosphorylation value of the MAPK_pT202Y204 (for the ERK_f activation) across all cell lines:
erk.data = mclp.data %>% select(Sample_Name, MAPK_pT202Y204) %>% na.omit()
# find the activity classes (3 classes: low expression, no expression, high expression)
res = Ckmeans.1d.dp(x = erk.data$MAPK_pT202Y204, k = 3)
activity = as.factor(res$cluster)
levels(activity) = c("low", "medium", "high")
erk.data = cbind(erk.data, activity)
ggdotchart(data = erk.data, x = "Sample_Name", y = "MAPK_pT202Y204",
title = "MAPK_pT202Y204 signaling across all cell lines in MCLP Dataset",
color = "activity", palette = c("red", "grey", "green"),
label = "Sample_Name", label.select = cell.lines.in.mclp, repel = TRUE, xlab = "Cell Lines",
add = "segments", label.rectangle = TRUE, ylab = "MAPK_pT202Y204 Signaling",
add.params = list(color = "activity", palette = c("red", "grey", "green"))) +
theme(axis.text.x = element_blank(), axis.ticks = element_blank())
One of the activating phosphorylation sites of ERK is the ERK1 T202/Y204.
From the above figure we see a weak signaling of this phoshosite for the A498 cell line which does not fully correlate well with the results from our analysis above (ERK_f overexpression). Though, in a later section we show that ERK_f is found active in all models from all cell lines that predict the AK-PD synergy (for example even in UACC62 cell line which has a high phosphorylation signal in the above figure).
Using the matrix from Table 2, we count per network node the number of times that the node’s average link operator difference value has surpassed a specified threshold - i.e. the number of times it has been found as important (a biomarker) across all the synergy set comparisons (so the more the better):
biomarker.link.mat = binarize_to_thres(mat = diff.link.mat, thres = 0.8)
biomarker.link.counts = colSums(biomarker.link.mat)
pretty_print_vector_names_and_values(table(biomarker.link.counts))
0: 44, 1: 1, 2: 1, 6: 5, 11: 1
The above means that there were \(44\) nodes that were zero times found as link operator biomarkers across all synergy set comparisons, one that was found once, one that was found twice, \(5\) that were found 6 times and \(1\) that was found 11 times. The last two categories of nodes are:
pretty_print_vector_names(biomarker.link.counts[biomarker.link.counts == 6])
5 nodes: TGFBR2, mTORC1_c, TSC_f, CDC42, RHOA
pretty_print_vector_names(biomarker.link.counts[biomarker.link.counts == 11])
1 node: SRC
If we visualize the average link operator difference across all of the synergy comparison sets from Table 2, we can see the nodes mentioned above:
plot_avg_link_operator_diff_graph(net, diff = colMeans(diff.link.mat), layout = nice.layout,
title = "Average link operator diff across all synergy subsets (AK-PD)")
Some notes/observations on the structure of the boolean equations of the nodes found above:
The boolean equation of the SRC node is: SRC *= ( RTPK_f ) AND/OR NOT ( CSK ) and it’s mean link operator difference value across all synergy set comparisons is -0.5721936 which means that on average the logic operator that binds its two regulators is the AND NOT and thus it’s more difficult to have it as activated (1 case out of 4 in boolean logic). This correlates with the fact that it was found as mostly inhibited in the analysis above. Also note the equations:
CSK *= ( PRKACA )PRKACA *= ( ( NFKB_f ) or FOS )FOS *= ( ( ( ERK_f ) or RSK_f ) or SRF )So, when ERK_f is overexpressed, CSK becomes active, which means that the prevalence of the AND NOT link operator makes the activity of the SRC node dependent only on it’s inhibitor CSK (since it’s active) and not on the activity of RTPK_f node.
mTORC1_c *= ( ( RHEB ) or RSK_f ) AND/OR NOT ( AKT1S1 )AKT1S1 *= not ( AKT_f )mTORC1_c’s mean link operator difference value across all synergy set comparisons is 0.699427 which means that on average the logic operator that binds its two regulators is the OR NOT.
This gives the node the structural flexibility to become active when the AK drug is used: AK inhibits AKT_f node, making thus AKT1S1 active, which in turn makes the mTORC1_c equation like this: mTORC1_c *= ( ( RHEB ) or RSK_f ) AND/OR 0. So, an AND link operator would result always in an inhibited mTORC1_c whereas an OR link operator would give the possibility for the mTORC1_c to be active in case one of its activators are active.
We conclude that cancer models in which the AK-PD drug combination is synergistic, tend to also have the nodes SRC and PTPN6 in an inhibited state and the SRC node depends on both it’s regulators, RTPK_f and CSK.
The link operators of the mTORC1_c and TSC_f equations also seem to have an important role in the manifestation of this synergy.
We get the average state and link operator differences per network node for the A498 cell line from the random models:
AK.PD.avg.state.diff.random = random.synergy.analysis.res$A498$diff.state.synergies.mat["AK-PD", ]
AK.PD.avg.link.diff.random = random.synergy.analysis.res$A498$diff.link.synergies.mat["AK-PD", ]
We will now visualize the nodes average state differences in a network graph. Note that the good models are those that predicted the AK-PD drug combination to be synergistic and were contrasted to those that predicted it to be antagonistic (bad models). The number of models in each category were:
drug.comb = "AK-PD"
good.models.num = sum(random.model.predictions[, drug.comb] == 1 & !is.na(random.model.predictions[, drug.comb]))
# unique good models
# nrow(unique(random.models.link.operator[random.model.predictions[, drug.comb] == 1 & !is.na(random.model.predictions[, drug.comb]),]))
bad.models.num = sum(random.model.predictions[, drug.comb] == 0 & !is.na(random.model.predictions[, drug.comb]))
pretty_print_string(paste0("Number of 'good' random models (AK-PD synergistic) in the A498 cell line: ", good.models.num))
Number of ‘good’ random models (AK-PD synergistic) in the A498 cell line: 107
pretty_print_string(paste0("Number of 'bad' random models (AK-PD antagonistic) in the A498 cell line: ", bad.models.num))
Number of ‘bad’ random models (AK-PD antagonistic) in the A498 cell line: 3632
plot_avg_state_diff_graph(net, diff = AK.PD.avg.state.diff.random,
layout = nice.layout, title = "AK-PD activity state biomarkers (Random models - A498)")
Thus, we can identify the active state biomarkers (note that no inhibited biomarkers at the specified threshold difference level were found):
AK.PD.active.biomarkers = AK.PD.avg.state.diff.random[AK.PD.avg.state.diff.random > 0.8]
pretty_print_vector_names(AK.PD.active.biomarkers)
1 node: ERK_f
The random models that predicted the AK-PD synergy also show an overexpression of the ERK_f node.
We visualize the nodes average link operator differences in a network graph:
plot_avg_link_operator_diff_graph(net, diff = AK.PD.avg.link.diff.random,
layout = nice.layout, title = "AK-PD link operator biomarkers (Random models - A498)")
The importance of the OR NOT link operator in the boolean equation of ERK_f is again proven to be crucial for the manifestation of the AK-PD synergy, along with the link operators of the equations of the MEK_f, PTEN and PDPK1 nodes.
We perform the same kind of analysis as with the cell-specific models: models that predict a set of synergies will be contrasted to models that predicted the same set with the addition of the extra AK-PD synergy, allowing us thus to refine the activity state and link operator biomarkers we found above from the random models.
We first construct two matrices: in the first, each row is a set vs subset average activity difference vector of the network nodes, while on the second each row is a set vs subset average link operator difference vector of the network nodes:
res = get_synergy_comparison_sets(random.synergy.analysis.res$A498$synergy.subset.stats)
AK.PD.res = res %>% filter(synergies == "AK-PD")
diff.state.list.random = list()
diff.link.list.random = list()
for (i in 1:nrow(AK.PD.res)) {
synergy.set = AK.PD.res[i, 2]
synergy.subset = AK.PD.res[i, 3]
synergy.set.str = unlist(strsplit(x = synergy.set, split = ","))
synergy.subset.str = unlist(strsplit(x = synergy.subset, split = ","))
# count models
synergy.set.models.num = count_models_that_predict_synergies(synergy.set.str, random.model.predictions)
synergy.subset.models.num = count_models_that_predict_synergies(synergy.subset.str, random.model.predictions)
# print(paste0(synergy.set.models.num, " ", synergy.subset.models.num))
# if too small number of models, skip the diff vector
if ((synergy.set.models.num <= 3) | (synergy.set.models.num <= 3))
next
# get the diff
diff.state.ak.pd = get_avg_activity_diff_based_on_synergy_set_cmp(synergy.set.str, synergy.subset.str, random.model.predictions, random.models.stable.state)
diff.link.ak.pd = get_avg_link_operator_diff_based_on_synergy_set_cmp(synergy.set.str, synergy.subset.str, random.model.predictions, random.models.link.operator)
diff.state.list.random[[paste0(synergy.set, " vs ", synergy.subset)]] = diff.state.ak.pd
diff.link.list.random[[paste0(synergy.set, " vs ", synergy.subset)]] = diff.link.ak.pd
}
diff.state.mat.random = do.call(rbind, diff.state.list.random)
diff.link.mat.random = do.call(rbind, diff.link.list.random)
caption.title.3 = "Table 3: Average activity difference values across all synergy set comparisons (AK-PD)"
datatable(data = diff.state.mat.random[, c("SRC", "CSK", "MEK_f", "STAT1", "PTPN6")], options = list(
searching = FALSE, pageLength = 5, lengthMenu = c(5, 10)),
caption = htmltools::tags$caption(caption.title.3, style="color:#dd4814; font-size: 18px")) %>%
formatRound(1:ncol(diff.state.mat.random), digits = 3)
caption.title.4 = "Table 4: Average link operator difference values across all synergy set comparisons (AK-PD)"
datatable(data = diff.link.mat.random[, c("SRC", "RAC_f", "MEK_f", "STAT1", "PTEN")], options = list(
searching = FALSE, pageLength = 5, lengthMenu = c(5, 10)),
caption = htmltools::tags$caption(caption.title.4, style="color:#dd4814; font-size: 18px")) %>%
formatRound(1:ncol(diff.link.mat.random), digits = 3)
Using the matrix from Table 3 (where we show just \(5\) nodes), we count per network node the number of times that the node’s average activity difference value has surpassed a specified threshold - i.e. the number of times it has been found as important (a biomarker) across all the synergy set comparisons (so the more the better):
biomarker.state.mat.random = binarize_to_thres(mat = diff.state.mat.random, thres = 0.7)
biomarker.state.counts.random = colSums(biomarker.state.mat.random)
pretty_print_vector_names_and_values(table(biomarker.state.counts.random))
0: 139, 1: 1, 2: 1, 4: 1, 6: 2
So, there are \(2\) nodes that were found as activity state biomarkers \(6\) times across all synergy set comparisons. These nodes are:
pretty_print_vector_names(biomarker.state.counts.random[biomarker.state.counts.random == 6])
2 nodes: PDPK1, PRKCD
But the total number of comparisons was 49 so we could argue that this is not a statistically significant result, which can be seen clearly in the next graph where we visualize the average activity state difference across all synergy comparison sets from Table 3:
plot_avg_state_diff_graph(net, diff = colMeans(diff.state.mat.random), layout = nice.layout,
title = "Average activity state diff across all synergy subsets (AK-PD)")
Using the matrix from Table 4, we count per network node the number of times that the node’s average link operator difference value has surpassed a specified threshold - i.e. the number of times it has been found as important (a biomarker) across all the synergy set comparisons (so the more the better):
biomarker.link.mat.random = binarize_to_thres(mat = diff.link.mat.random, thres = 0.7)
biomarker.link.counts.random = colSums(biomarker.link.mat.random)
pretty_print_vector_names_and_values(table(biomarker.link.counts.random))
0: 50, 1: 1, 7: 1
So, there was a node that was found \(7\) times (out of a total of 49, so not statistically significant) as a link operator biomarkers across all synergy set comparisons:
pretty_print_vector_names(biomarker.link.counts.random[biomarker.link.counts.random == 7])
1 node: mTORC1_c
We visualize the average link operator difference across all synergy comparison sets from Table 4:
plot_avg_link_operator_diff_graph(net, diff = colMeans(diff.link.mat.random), layout = nice.layout,
title = "Average link operator diff across all synergy subsets (AK-PD)")
Though the AK-PD synergy was observed and predicted in the A498 model dataset, we investigate its biomarkers in the other cell lines where it was predicted as a False Positive (FP) synergy (predicted by the models but not observed in the experiments).
Thus, we can still see if there are any common (activity state and link operator) biomarkers for AK-PD across the all the cell-specific models by contrasting in each cell line the models that predicted AK-PD vs the models that did not:
drug.comb = "AK-PD"
ak.pd.diff.state.list = list()
ak.pd.diff.link.list = list()
for (cell.line in cell.lines) {
if (cell.line == "A498")
next
ak.pd.diff.state.list[[cell.line]] = get_avg_activity_diff_based_on_specific_synergy_prediction(
model.predictions = model.predictions.per.cell.line[[cell.line]],
models.stable.state = models.stable.state.per.cell.line[[cell.line]],
drug.comb)
ak.pd.diff.link.list[[cell.line]] = get_avg_link_operator_diff_based_on_specific_synergy_prediction(
model.predictions = model.predictions.per.cell.line[[cell.line]],
models.link.operator = models.link.operators.per.cell.line[[cell.line]],
drug.comb)
# print(count_models_that_predict_synergies(drug.comb, model.predictions = model.predictions.per.cell.line[[cell.line]]))
}
ak.pd.diff.state.mat = do.call(rbind, ak.pd.diff.state.list)
ak.pd.diff.link.mat = do.call(rbind, ak.pd.diff.link.list)
ERK_f was the one node that was found as an activity state & link operator AK-PD biomarker in all cell lines
ak.pd.biomarker.state.mat = binarize_to_thres(mat = ak.pd.diff.state.mat, thres = 0.7)
ak.pd.biomarker.link.mat = binarize_to_thres(mat = ak.pd.diff.link.mat, thres = 0.7)
ak.pd.biomarker.state.mat.counts = colSums(ak.pd.biomarker.state.mat)
ak.pd.biomarker.link.mat.counts = colSums(ak.pd.biomarker.link.mat)
#pretty_print_vector_names_and_values(table(ak.pd.biomarker.state.mat.counts))
#pretty_print_vector_names_and_values(table(ak.pd.biomarker.link.mat.counts))
pretty_print_vector_names(ak.pd.biomarker.state.mat.counts[ak.pd.biomarker.state.mat.counts == 7])
1 node: ERK_f
pretty_print_vector_names(ak.pd.biomarker.link.mat.counts[ak.pd.biomarker.link.mat.counts == 6])
1 node: ERK_f
As observed in the heatmap above, the PD-PI synergy was predicted by both the cell specific and random models in the A498 cell line.
PD.PI.avg.state.diff.cell.specific = cell.specific.synergy.analysis.res$A498$diff.state.synergies.mat["PD-PI",]
#PD.PI.avg.link.diff.cell.specific = cell.specific.synergy.analysis.res$A498$diff.link.synergies.mat["PD-PI", ]
We will now visualize the nodes average state differences in a network graph. Note that the good models are those that predicted the PD-PI drug combination to be synergistic and were contrasted to those that predicted it to be antagonistic (bad models). The number of models in each category were:
model.predictions = model.predictions.per.cell.line[["A498"]]
models.stable.state = models.stable.state.per.cell.line[["A498"]]
drug.comb = "PD-PI"
good.models.num = sum(model.predictions[, drug.comb] == 1 & !is.na(model.predictions[, drug.comb]))
# unique good models
# models.link.operator = models.link.operators.per.cell.line$A498
# nrow(unique(models.link.operator[model.predictions[, drug.comb] == 1 & !is.na(model.predictions[, drug.comb]), ]))
bad.models.num = sum(model.predictions[, drug.comb] == 0 & !is.na(model.predictions[, drug.comb]))
pretty_print_string(paste0("Number of 'good' models (PD-PI synergistic) in the A498 cell line: ", good.models.num))
Number of ‘good’ models (PD-PI synergistic) in the A498 cell line: 107
pretty_print_string(paste0("Number of 'bad' models (PD-PI antagonistic) in the A498 cell line: ", bad.models.num))
Number of ‘bad’ models (PD-PI antagonistic) in the A498 cell line: 6873
plot_avg_state_diff_graph(net, diff = PD.PI.avg.state.diff.cell.specific,
layout = nice.layout, title = "PD-PI activity state biomarkers (Cell specific models - A498)")
We now visualize the nodes average state differences in a dotchart where we can easily identify the active and inhibited state biomarkers with the defined thresholds (we have excluded nodes whose absolute average difference value was less than \(0.05\)):
df = as.data.frame(PD.PI.avg.state.diff.cell.specific)
colnames(df) = "avg.state.diff"
df = df %>%
rownames_to_column(var = "nodes") %>%
filter(avg.state.diff > 0.05 | avg.state.diff < -0.05)
ggdotchart(df, x = "nodes", y = "avg.state.diff",
title = "Average activity state difference (Good - bad) - A498 cell line",
label = "nodes", font.label = list(size = 11, color = "blue"),
label.select = list(criteria = "`y` >= 0.7 | `y` <= -0.5"),
repel = TRUE, label.rectangle = TRUE,
xlab = "Nodes", ylab = "Average Activity State",
ylim = c(-1,1), add = "segments") +
font("x.text", size = 8) +
font("title", size = 14) +
geom_hline(yintercept = -0.7, linetype = "dashed", color = "red") +
geom_hline(aes(yintercept = 0.7, linetype = "0.7"), color = "red") +
scale_linetype_manual(name = "Threshold", values = 2,
guide = guide_legend(override.aes = list(color = "red"))) +
theme(legend.position = c(0.2,0.7)) +
annotate("text", x = c(18, 35), y = -0.9, label = c("Good models: PD-PI synergy,", "Bad Models: PD-PI antagonism"))
Note the boolean equation: PPM1A *= ( PTEN ). This means that PTEN is the only inhibited node of interest
We now check the PTEN in the MCLP dataset:
pten.data = mclp.data %>% select(Sample_Name, PTEN) %>% na.omit()
# find the activity classes (3 classes: low activity, no activity, high activity)
res = Ckmeans.1d.dp(x = pten.data$PTEN, k = 3)
activity = as.factor(res$cluster)
levels(activity) = c("low", "medium", "high")
pten.data = cbind(pten.data, activity)
ggdotchart(data = pten.data, x = "Sample_Name", y = "PTEN",
title = "PTEN signaling across all cell lines in MCLP Dataset",
color = "activity", palette = c("red", "grey", "green"),
label = "Sample_Name", label.select = cell.lines.in.mclp, repel = TRUE,
add = "segments", label.rectangle = TRUE,
xlab = "Cell Lines", ylab = "PTEN Signaling",
add.params = list(color = "activity", palette = c("red", "grey", "green"))) +
theme(axis.text.x = element_blank(), axis.ticks = element_blank())
The PTEN node is found highly expressed in the A498 cell line, but for the models to predict the PD-PI synergy we found that the node needs to be in an inhibited state (see here). The overexpression of the ERK_f node is also a result from this analysis.
It will be interesting to find all the possible synergy sets and subsets that include the PD-PI as the extra synergy.
Using these synergy sets, models that predict a set of synergies will be contrasted to models that predicted the same set with the addition of the extra PD-PI synergy.
Thus we could find synergy biomarkers that allow already good predicting models to predict the additional synergy of interest.
This investigation will allow us thus to refine the activity state biomarkers we found above.
model.predictions = model.predictions.per.cell.line$A498
models.stable.state = models.stable.state.per.cell.line$A498
models.link.operator = models.link.operators.per.cell.line$A498
res = get_synergy_comparison_sets(cell.specific.synergy.analysis.res$A498$synergy.subset.stats)
PD.PI.res = res %>% filter(synergies == "PD-PI")
diff.state.list = list()
for (i in 1:nrow(PD.PI.res)) {
synergy.set = PD.PI.res[i, 2]
synergy.subset = PD.PI.res[i, 3]
synergy.set.str = unlist(strsplit(x = synergy.set, split = ","))
synergy.subset.str = unlist(strsplit(x = synergy.subset, split = ","))
# count models
synergy.set.models.num = count_models_that_predict_synergies(synergy.set.str, model.predictions)
synergy.subset.models.num = count_models_that_predict_synergies(synergy.subset.str, model.predictions)
# if too small number of models, skip the diff vector
# if ((synergy.set.models.num <= 3) | (synergy.set.models.num <= 3))
# next
# get the diff
diff.state.PD.PI = get_avg_activity_diff_based_on_synergy_set_cmp(synergy.set.str, synergy.subset.str, model.predictions, models.stable.state)
diff.state.list[[paste0(synergy.set, " vs ", synergy.subset)]] = diff.state.PD.PI
}
diff.state.mat = do.call(rbind, diff.state.list)
caption.title.1 = "Table 5: Average activity difference values across all synergy set comparisons (PD-PI)"
datatable(data = diff.state.mat[, c("SRC", "CSK", "MEK_f", "STAT1", "PTPN6")], options = list(
searching = FALSE, pageLength = 5, lengthMenu = c(5, 10)),
caption = htmltools::tags$caption(caption.title.1, style="color:#dd4814; font-size: 18px")) %>%
formatRound(1:ncol(diff.state.mat), digits = 3)
If we visualize the average activity state difference across all synergy comparison sets from Table 5 in a network graph, we see that there are no nodes with significant average state difference:
plot_avg_state_diff_graph(net, diff = colMeans(diff.state.mat), layout = nice.layout,
title = "Average activity state diff across all synergy subsets (PD-PI)")
We also visualize the median activity difference per node from Table 5 using a dotchart, where we have excluded the nodes that have an absolute median activity difference of \(0.05\) or less:
df = as.data.frame(apply(diff.state.mat, 2, median))
colnames(df) = "median.state.diff"
df = df %>%
rownames_to_column(var = "nodes") %>%
filter(median.state.diff > 0.05 | median.state.diff < -0.05)
ggdotchart(df, x = "nodes", y = "median.state.diff",
title = "Median activity state difference across all synergy comparison sets (PD-PI, A498 cell line)",
label = "nodes", font.label = list(size = 11, color = "blue"),
label.select = list(criteria = "`y` >= 0.7 | `y` <= -0.7"),
repel = TRUE, label.rectangle = TRUE,
xlab = "Nodes", ylab = "Median Activity State",
ylim = c(-1,1), add = "segments") +
font("x.text", size = 10) +
font("title", size = 11) +
geom_hline(yintercept = -0.7, linetype = "dashed", color = "red") +
geom_hline(aes(yintercept = 0.7, linetype = "0.7"), color = "red") +
scale_linetype_manual(name = "Threshold", values = 2,
guide = guide_legend(override.aes = list(color = "red"))) +
theme(legend.position = c(0.2,0.7))
So no significant nodes whose activity plays important role in the manifestation of the PD-PI synergy were found with the synergy subset analysis method.
We get the average state differences per network node for the A498 cell line from the random models:
PD.PI.avg.state.diff.random = random.synergy.analysis.res$A498$diff.state.synergies.mat["PD-PI",]
We will now visualize the nodes average state differences in a network graph. Note that the good models are those that predicted the PD-PI drug combination to be synergistic and were contrasted to those that predicted it to be antagonistic (bad models). The number of models in each category were:
drug.comb = "PD-PI"
good.models.num = sum(random.model.predictions[, drug.comb] == 1 & !is.na(random.model.predictions[, drug.comb]))
# unique good models
# nrow(unique(random.models.link.operator[random.model.predictions[, drug.comb] == 1 & !is.na(random.model.predictions[, drug.comb]),]))
bad.models.num = sum(random.model.predictions[, drug.comb] == 0 & !is.na(random.model.predictions[, drug.comb]))
pretty_print_string(paste0("Number of 'good' random models (PD-PI synergistic) in the A498 cell line: ", good.models.num))
Number of ‘good’ random models (PD-PI synergistic) in the A498 cell line: 719
pretty_print_string(paste0("Number of 'bad' random models (PD-PI antagonistic) in the A498 cell line: ", bad.models.num))
Number of ‘bad’ random models (PD-PI antagonistic) in the A498 cell line: 6066
plot_avg_state_diff_graph(net, diff = PD.PI.avg.state.diff.random,
layout = nice.layout, title = "PD-PI activity state biomarkers (Random models - A498)")
We now visualize the nodes average state differences in a dotchart where we can easily identify the active and inhibited state biomarkers with the defined thresholds (we have excluded nodes whose absolute average difference value was less than \(0.05\)):
df = as.data.frame(PD.PI.avg.state.diff.random)
colnames(df) = "avg.state.diff"
df = df %>%
rownames_to_column(var = "nodes") %>%
filter(avg.state.diff > 0.05 | avg.state.diff < -0.05)
ggdotchart(df, x = "nodes", y = "avg.state.diff",
title = "Average activity state difference for Random Models (Good - bad) - A498 cell line",
label = "nodes", font.label = list(size = 10, color = "blue"),
label.select = list(criteria = "`y` >= 0.7 | `y` <= -0.5"),
repel = TRUE, label.rectangle = TRUE,
xlab = "Nodes", ylab = "Average Activity State",
ylim = c(-1,1), add = "segments") +
font("x.text", size = 8) +
font("title", size = 14) +
geom_hline(yintercept = -0.7, linetype = "dashed", color = "red") +
geom_hline(aes(yintercept = 0.7, linetype = "0.7"), color = "red") +
scale_linetype_manual(name = "Threshold", values = 2,
guide = guide_legend(override.aes = list(color = "red"))) +
theme(legend.position = c(0.2,0.7)) +
annotate("text", x = c(8, 17), y = -0.9, label = c("Good models: PD-PI synergy,", "Bad Models: PD-PI antagonism"))
The overexpression of ERK_f is also a characteristic of the random models that predict the PD-PI drug combination to be synergistic.
We perform the same kind of analysis as with the cell-specific models: models that predict a set of synergies will be contrasted to models that predicted the same set with the addition of the extra PD-PI synergy, allowing us thus to refine the activity state biomarkers we found above from the random models.
res = get_synergy_comparison_sets(random.synergy.analysis.res$A498$synergy.subset.stats)
PD.PI.res = res %>% filter(synergies == "PD-PI")
diff.state.list.random = list()
for (i in 1:nrow(PD.PI.res)) {
synergy.set = PD.PI.res[i, 2]
synergy.subset = PD.PI.res[i, 3]
synergy.set.str = unlist(strsplit(x = synergy.set, split = ","))
synergy.subset.str = unlist(strsplit(x = synergy.subset, split = ","))
# count models
synergy.set.models.num = count_models_that_predict_synergies(synergy.set.str, random.model.predictions)
synergy.subset.models.num = count_models_that_predict_synergies(synergy.subset.str, random.model.predictions)
# print(paste0(synergy.set.models.num, " ", synergy.subset.models.num))
# if too small number of drug combinations in the subset, skip the diff vector
#if (length(synergy.subset.str) <= 3) next
# if too small number of models compared
#if ((synergy.set.models.num <= 3) | (synergy.set.models.num <= 3)) next
# get the diff
diff.state.PD.PI = get_avg_activity_diff_based_on_synergy_set_cmp(synergy.set.str, synergy.subset.str, random.model.predictions, random.models.stable.state)
diff.state.list.random[[paste0(synergy.set, " vs ", synergy.subset)]] = diff.state.PD.PI
}
diff.state.mat.random = do.call(rbind, diff.state.list.random)
caption.title.6 = "Table 6: Average link operator difference values across all synergy set comparisons (PD-PI)"
datatable(data = diff.state.mat.random[, c("SRC", "RAC_f", "MEK_f", "STAT1", "PTEN")], options = list(
searching = FALSE, pageLength = 5, lengthMenu = c(5, 10)),
caption = htmltools::tags$caption(caption.title.6, style="color:#dd4814; font-size: 18px")) %>%
formatRound(1:ncol(diff.state.mat.random), digits = 3)
If we visualize the average activity state difference across all synergy comparison sets from Table 6 we see that there are no nodes with significant average state difference:
plot_avg_state_diff_graph(net, diff = colMeans(diff.state.mat.random), layout = nice.layout,
title = "Average activity state diff across all synergy subsets (PD-PI)")
We also visualize the median activity difference per node from Table 6 using a dotchart, where we have excluded the nodes that have an absolute median activity difference of \(0.05\) or less:
df = as.data.frame(apply(diff.state.mat.random, 2, median))
colnames(df) = "median.state.diff"
df = df %>%
rownames_to_column(var = "nodes") %>%
filter(median.state.diff > 0.05 | median.state.diff < -0.05)
ggdotchart(df, x = "nodes", y = "median.state.diff",
title = "Median activity state difference across all synergy comparison sets (PD-PI, Random Models, A498)",
label = "nodes", font.label = list(size = 11, color = "blue"),
label.select = list(criteria = "`y` >= 0.7 | `y` <= -0.7"),
repel = TRUE, label.rectangle = TRUE,
xlab = "Nodes", ylab = "Median Activity State",
ylim = c(-1,1), add = "segments") +
font("x.text", size = 10) +
font("title", size = 11) +
geom_hline(yintercept = -0.7, linetype = "dashed", color = "red") +
geom_hline(aes(yintercept = 0.7, linetype = "0.7"), color = "red") +
scale_linetype_manual(name = "Threshold", values = 2,
guide = guide_legend(override.aes = list(color = "red"))) +
theme(legend.position = c(0.2,0.7))
So no significant nodes whose activity plays important role in the manifestation of the PD-PI synergy were found with the synergy subset analysis method for the random models.
As observed in the heatmap above, the PD-G2 synergy was predicted by both the cell specific and random models in the A498 cell line.
As observed in the heatmap above, the PI-D1 synergy was predicted by both the cell specific and random models in all the cell lines tested.
xfun::session_info()
R version 3.6.2 (2019-12-12)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS
Locale:
LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
LC_PAPER=en_US.UTF-8 LC_NAME=C
LC_ADDRESS=C LC_TELEPHONE=C
LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
Package version:
assertthat_0.2.1 backports_1.1.5 base64enc_0.1-3
BH_1.72.0.3 bibtex_0.4.2.2 bookdown_0.17
circlize_0.4.8 Ckmeans.1d.dp_4.3.0 cli_2.0.1
clue_0.3-57 cluster_2.1.0 colorspace_1.4-1
compiler_3.6.2 ComplexHeatmap_2.0.0 cowplot_1.0.0
crayon_1.3.4 crosstalk_1.0.0 digest_0.6.23
dplyr_0.8.3 DT_0.11 ellipsis_0.3.0
emba_0.1.2 evaluate_0.14 fansi_0.4.1
farver_2.0.3 fastmap_1.0.1 gbRd_0.4-11
GetoptLong_0.1.8 ggplot2_3.2.1 ggpubr_0.2.4
ggrepel_0.8.1 ggsci_2.9 ggsignif_0.6.0
GlobalOptions_0.1.1 glue_1.3.1 graphics_3.6.2
grDevices_3.6.2 grid_3.6.2 gridExtra_2.3
gtable_0.3.0 highr_0.8 htmltools_0.4.0
htmlwidgets_1.5.1 httpuv_1.5.2 igraph_1.2.4.2
jsonlite_1.6 knitr_1.27 labeling_0.3
later_1.0.0 lattice_0.20.38 lazyeval_0.2.2
lifecycle_0.1.0 magrittr_1.5 markdown_1.1
MASS_7.3.51.5 Matrix_1.2.18 methods_3.6.2
mgcv_1.8.31 mime_0.8 munsell_0.5.0
nlme_3.1.143 parallel_3.6.2 pillar_1.4.3
pkgconfig_2.0.3 plogr_0.2.0 plyr_1.8.5
png_0.1-7 polynom_1.4.0 promises_1.1.0
purrr_0.3.3 R6_2.4.1 RColorBrewer_1.1-2
Rcpp_1.0.3 Rdpack_0.11-1 reshape2_1.4.3
rje_1.10.13 rjson_0.2.20 rlang_0.4.4
rmarkdown_2.1 scales_1.1.0 shape_1.4.4
shiny_1.4.0 sourcetools_0.1.7 splines_3.6.2
stats_3.6.2 stringi_1.4.5 stringr_1.4.0
tibble_2.1.3 tidyr_1.0.2 tidyselect_1.0.0
tinytex_0.19 tools_3.6.2 usefun_0.4.3
utf8_1.1.4 utils_3.6.2 vctrs_0.2.2
viridisLite_0.3.0 visNetwork_2.0.9 withr_2.1.2
xfun_0.12 xtable_1.8.4 yaml_2.2.0
zeallot_0.1.0
r